Information-Driven Search and Track of Novel Space Objects
Wolf, Trevor N., Jones, Brandon A.
Space surveillance depends on efficiently directing sensor resources to maintain custody of known catalog objects. However, it remains unclear how best to use these resources to rapidly search for and track newly detected space objects. Given a novel measurement, a search set can be instantiated through admissible region constraints to inform follow-up observations. Without well-constrained bounds, this set rapidly spreads in the along-track direction, growing much larger than a follow-up sensor's finite field of view. Moreover, the number of novel objects may be uncertain, and follow-up observations are commonly corrupted by missed detections and by false positives from known catalog objects. In this work, we address these challenges by introducing a joint sensor control and multi-target tracking approach. The search set associated with a novel measurement is represented by a Cardinalized Probability Hypothesis Density (CPHD), which jointly tracks the state uncertainty associated with a set of objects and a probability mass function for the true target number. In follow-up sensor scans, the information contained in an empty measurement set, and in returns from both novel objects and known catalog objects, is succinctly captured through this paradigm. To maximize the utility of a follow-up sensor, we introduce an information-driven sensor control approach for steering the instrument. Our methods are tested on two relevant test cases, and we provide a comparative analysis with current naive tasking strategies.
Factored Conditional Filtering: Tracking States and Estimating Parameters in High-Dimensional Spaces
Chen, Dawei, Yang-Zhao, Samuel, Lloyd, John, Ng, Kee Siong
This paper introduces factored conditional filters, new filtering algorithms for simultaneously tracking states and estimating parameters in high-dimensional state spaces. The conditional nature of the algorithms is used to estimate parameters and the factored nature is used to decompose the state space into low-dimensional subspaces in such a way that filtering on these subspaces gives distributions whose product is a good approximation to the distribution on the entire state space. The conditions for successful application of the algorithms are that observations be available at the subspace level and that the transition model can be factored into local transition models that are approximately confined to the subspaces; these conditions are widely satisfied in computer science, engineering, and geophysical filtering applications. We give experimental results on tracking epidemics and estimating parameters in large contact networks that show the effectiveness of our approach.
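The factoring idea in the abstract above can be illustrated with a minimal sketch. It assumes the simplest possible case: discrete states, transition models that are exactly local to each subspace, and one observation likelihood per subspace. The function names and array layouts are hypothetical, the conditional (parameter-estimation) part is omitted, and this is not the authors' algorithm. In this special case the product of the per-factor filters matches the exact joint filter, which is the property the general method approximates.

```python
import numpy as np

def factored_filter_step(beliefs, transitions, likelihoods):
    """One predict/update step applied independently to each factor.

    beliefs:     (n_factors, K), each row is a distribution over K states
    transitions: (n_factors, K, K), transitions[i, a, b] = P(next=b | current=a)
    likelihoods: (n_factors, K), likelihoods[i, b] = P(obs_i | state_i = b)
    """
    # Predict: push each factor's belief through its own local transition model.
    predicted = np.einsum("ia,iab->ib", beliefs, transitions)
    # Update: weight by the local observation likelihood, then renormalize per factor.
    posterior = predicted * likelihoods
    return posterior / posterior.sum(axis=1, keepdims=True)

def joint_filter_step(joint, transitions, likelihoods):
    """Exact step on the joint distribution of two factors (for comparison only)."""
    joint_transition = np.kron(transitions[0], transitions[1])
    joint_likelihood = np.kron(likelihoods[0], likelihoods[1])
    posterior = (joint @ joint_transition) * joint_likelihood
    return posterior / posterior.sum()

# Illustrative random models for two factors with K = 3 states each.
rng = np.random.default_rng(1)
K = 3
transitions = rng.random((2, K, K))
transitions /= transitions.sum(axis=2, keepdims=True)  # make rows stochastic
likelihoods = rng.random((2, K))
beliefs = np.full((2, K), 1.0 / K)                      # uniform priors

new_beliefs = factored_filter_step(beliefs, transitions, likelihoods)
joint_new = joint_filter_step(np.kron(beliefs[0], beliefs[1]), transitions, likelihoods)
```

The factored step stores and updates 2 × 3 numbers, whereas the joint step works on 3² numbers; that gap grows exponentially with the number of factors, which is why a factored representation is attractive in high-dimensional state spaces.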
MIRAI: Evaluating LLM Agents for Event Forecasting
Ye, Chenchen, Hu, Ziniu, Deng, Yihe, Huang, Zijie, Ma, Mingyu Derek, Zhu, Yanqiao, Wang, Wei
Recent advancements in Large Language Models (LLMs) have empowered LLM agents to autonomously collect world information and reason over it to solve complex problems. Given this capability, there is increasing interest in employing LLM agents to predict international events, which can influence decision-making and shape policy development on an international scale. Despite this growing interest, there is no rigorous benchmark of LLM agents' forecasting capability and reliability. To address this gap, we introduce MIRAI, a novel benchmark designed to systematically evaluate LLM agents as temporal forecasters in the context of international events. Our benchmark features an agentic environment with tools for accessing an extensive database of historical, structured events and textual news articles. We refine the GDELT event database with careful cleaning and parsing to curate a series of relational prediction tasks with varying forecasting horizons, assessing LLM agents' abilities from short-term to long-term forecasting. We further implement APIs that enable LLM agents to use different tools via a code-based interface. In summary, MIRAI comprehensively evaluates the agents' capabilities in three dimensions: 1) autonomously sourcing and integrating critical information from large global databases; 2) writing code using domain-specific APIs and libraries for tool use; and 3) jointly reasoning over historical knowledge from diverse formats and times to accurately predict future events. Through comprehensive benchmarking, we aim to establish a reliable framework for assessing the capabilities of LLM agents in forecasting international events, thereby contributing to the development of more accurate and trustworthy models for international relations analysis.
A lexicon obtained and validated by a data-driven approach for organic residues valorization in emerging and developing countries
Rakotomalala, Christiane, Paillat, Jean-Marie, Feder, Frédéric, Avadí, Angel, Thuriès, Laurent, Vermeire, Marie-Liesse, Médoc, Jean-Michel, Wassenaar, Tom, Hottelart, Caroline, Kieffer, Lilou, Ndjie, Elisa, Picart, Mathieu, Tchamgoue, Jorel, Tulle, Alvin, Valade, Laurine, Boyer, Annie, Duchamp, Marie-Christine, Roche, Mathieu
The text mining method presented in this paper was used to annotate terms related to the biological transformation and valorization of organic residues in agriculture in low- and middle-income countries. A specialized lexicon was obtained through several steps: corpus construction and term extraction, annotation of the extracted terms, and selection of relevant terms.
Scalable Extraction of Training Data from (Production) Language Models
Nasr, Milad, Carlini, Nicholas, Hayase, Jonathan, Jagielski, Matthew, Cooper, A. Feder, Ippolito, Daphne, Choquette-Choo, Christopher A., Wallace, Eric, Tramèr, Florian, Lee, Katherine
This paper studies extractable memorization: training data that an adversary can efficiently extract by querying a machine learning model without prior knowledge of the training dataset. We show an adversary can extract gigabytes of training data from open-source language models like Pythia or GPT-Neo, semi-open models like LLaMA or Falcon, and closed models like ChatGPT. Existing techniques from the literature suffice to attack unaligned models; in order to attack the aligned ChatGPT, we develop a new divergence attack that causes the model to diverge from its chatbot-style generations and emit training data at a rate 150x higher than when behaving properly. Our methods show practical attacks can recover far more data than previously thought, and reveal that current alignment techniques do not eliminate memorization.
Self-prompted Chain-of-Thought on Large Language Models for Open-domain Multi-hop Reasoning
Wang, Jinyuan, Li, Junlong, Zhao, Hai
In open-domain question-answering (ODQA), most existing questions require single-hop reasoning on commonsense. To further extend this task, we officially introduce open-domain multi-hop reasoning (ODMR) by answering multi-hop questions with explicit reasoning steps in an open-domain setting. Recently, large language models (LLMs) have found significant utility in facilitating ODQA without an external corpus. Furthermore, chain-of-thought (CoT) prompting boosts the reasoning capability of LLMs to a greater extent with manual or automated paradigms. However, existing automated methods lack quality assurance, while manual approaches suffer from limited scalability and poor diversity, hindering the capabilities of LLMs. In this paper, we propose Self-prompted Chain-of-Thought (SP-CoT), an automated framework to mass-produce high-quality CoTs of LLMs, by LLMs and for LLMs. SP-CoT introduces an automated generation pipeline for high-quality ODMR datasets, an adaptive sampler for in-context CoT selection, and self-prompted inference via in-context learning. Extensive experiments on four multi-hop question-answering benchmarks show that our proposed SP-CoT not only significantly surpasses the previous SOTA methods on large-scale (175B) LLMs, but also nearly doubles the zero-shot performance of small-scale (13B) LLMs. Further analysis reveals the remarkable capability of SP-CoT to elicit direct and concise intermediate reasoning steps by recalling approximately 50% of intermediate answers on the MuSiQue-Ans dataset.
Multi-VALUE: A Framework for Cross-Dialectal English NLP
Ziems, Caleb, Held, William, Yang, Jingfeng, Dhamala, Jwala, Gupta, Rahul, Yang, Diyi
Dialect differences caused by regional, social, and economic factors lead to performance discrepancies for many groups of language technology users. Inclusive and equitable language technology must critically be dialect invariant, meaning that performance remains constant over dialectal shifts. Current systems often fall short of this ideal since they are designed and tested on a single dialect: Standard American English (SAE). We introduce a suite of resources for evaluating and achieving English dialect invariance. The resource is called Multi-VALUE, a controllable rule-based translation system spanning 50 English dialects and 189 unique linguistic features. Multi-VALUE maps SAE to synthetic forms of each dialect. First, we use this system to stress test question answering, machine translation, and semantic parsing. Stress tests reveal significant performance disparities for leading models on non-standard dialects. Second, we use this system as a data augmentation technique to improve the dialect robustness of existing systems. Finally, we partner with native speakers of Chicano and Indian English to release new gold-standard variants of the popular CoQA task. To execute the transformation code, run model checkpoints, and download both synthetic and gold-standard dialectal benchmark datasets, see http://value-nlp.org.
Open-Source Ground-based Sky Image Datasets for Very Short-term Solar Forecasting, Cloud Analysis and Modeling: A Comprehensive Survey
Nie, Yuhao, Li, Xiatong, Paletta, Quentin, Aragon, Max, Scott, Andea, Brandt, Adam
Sky-image-based solar forecasting using deep learning has been recognized as a promising approach to reducing the uncertainty in solar power generation. However, one of the biggest challenges is the lack of massive and diversified sky image samples. In this study, we present a comprehensive survey of open-source ground-based sky image datasets for very short-term solar forecasting (i.e., forecasting horizons of less than 30 minutes), as well as for related research areas that can potentially help improve solar forecasting methods, including cloud segmentation, cloud classification, and cloud motion prediction. We first identify 72 open-source sky image datasets that satisfy the needs of machine/deep learning. Then a database of information about various aspects of the identified datasets is constructed. To evaluate each surveyed dataset, we further develop a multi-criteria ranking system based on 8 dimensions of the datasets that could have important impacts on the usage of the data. Finally, we provide insights on the usage of these datasets for different applications. We hope this paper can provide an overview for researchers who are looking for datasets for very short-term solar forecasting and related areas.
Fast Model Editing at Scale
Mitchell, Eric, Lin, Charles, Bosselut, Antoine, Finn, Chelsea, Manning, Christopher D.
While large pre-trained models have enabled impressive results on a variety of downstream tasks, the largest existing models still make errors, and even accurate predictions may become outdated over time. Because detecting all such failures at training time is impossible, enabling both developers and end users of such models to correct inaccurate outputs while leaving the model otherwise intact is desirable. However, the distributed, black-box nature of the representations learned by large neural networks makes producing such targeted edits difficult. If presented with only a single problematic input and new desired output, fine-tuning approaches tend to overfit; other editing algorithms are either computationally infeasible or simply ineffective when applied to very large models. To enable easy post-hoc editing at scale, we propose Model Editor Networks with Gradient Decomposition (MEND), a collection of small auxiliary editing networks that use a single desired input-output pair to make fast, local edits to a pre-trained model. MEND learns to transform the gradient obtained by standard fine-tuning, using a low-rank decomposition of the gradient to make the parameterization of this transformation tractable. MEND can be trained on a single GPU in less than a day even for 10 billion parameter models; once trained MEND enables rapid application of new edits to the pre-trained model. Our experiments with T5, GPT, BERT, and BART models show that MEND is the only approach to model editing that produces effective edits for models with tens of millions to over 10 billion parameters. Increasingly large neural networks have become a fundamental tool in solving data-driven problems in computer vision (Huang et al., 2017) and natural language processing (Vaswani et al., 2017) in particular. However, a key challenge in deploying and maintaining such models is issuing patches to adjust model behavior after deployment (Sinitsin et al., 2020).
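The MEND abstract above refers to a low-rank decomposition of the fine-tuning gradient. One standard fact that makes such a decomposition possible is that, for a single example, the gradient of the loss with respect to a linear layer's weight matrix is a rank-1 outer product of the output-side gradient and the layer input. The sketch below only illustrates that fact, assuming a bias-free linear layer and a squared-error loss. The helper names are hypothetical, and the learned editor networks themselves are not shown.

```python
import numpy as np

def loss(W, x, t):
    """Squared-error loss of a bias-free linear layer y = W x against target t."""
    residual = W @ x - t
    return 0.5 * float(residual @ residual)

def gradient_factors(W, x, t):
    """Return (delta, x) such that dLoss/dW = outer(delta, x) for one example."""
    delta = W @ x - t  # gradient of the loss with respect to the layer output
    return delta, x

def numerical_gradient(W, x, t, eps=1e-6):
    """Central-difference gradient, used only to check the analytic factors."""
    grad = np.zeros_like(W)
    for i in range(W.shape[0]):
        for j in range(W.shape[1]):
            step = np.zeros_like(W)
            step[i, j] = eps
            grad[i, j] = (loss(W + step, x, t) - loss(W - step, x, t)) / (2 * eps)
    return grad

# Illustrative random layer with 6 inputs and 4 outputs.
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 6))
x = rng.normal(size=6)
t = rng.normal(size=4)

delta, u = gradient_factors(W, x, t)
full_grad = np.outer(delta, u)  # 4 x 6 matrix reconstructed from 4 + 6 numbers
```

For a d × d weight matrix, the two factors hold 2d numbers rather than d², which suggests why a transformation defined on the factors can stay tractable for very large models.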